
2 research outputs found

Bringing UMAP Closer to the Speed of Light with GPU Acceleration

Authors: Lafargue Victor; Nanditale Thejaswi; Nolet Corey J.; Oates Tim; Patterson Joshua; Raff Edward; Zedlewski John
Publication date: 29/03/2021

The Uniform Manifold Approximation and Projection (UMAP) algorithm has become widely popular for its ease of use, quality of results, and support for exploratory, unsupervised, supervised, and semi-supervised learning. While many algorithms can be ported to a GPU in a simple and direct fashion, such efforts have resulted in inefficient and inaccurate versions of UMAP. We show a number of techniques that can be used to make a faster and more faithful GPU version of UMAP, and obtain speedups of up to 100x in practice. Many of these design choices and lessons are general purpose and may inform the conversion of other graph and manifold learning algorithms to use GPUs. Our implementation has been made publicly available as part of the open source RAPIDS cuML library (https://github.com/rapidsai/cuml).

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

cuSLINK: Single-linkage Agglomerative Clustering on the GPU

Authors: Doijade Mahesh; Eaton Joe; Fender Alex; Gala Divye; Nolet Corey J.; Oates Tim; Raff Edward; Rees Brad; Zedlewski John
Publication date: 28/06/2023

In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU which requires only O(Nk) space and uses a parameter k to trade off space and time. We also propose a set of novel and reusable building blocks that compose cuSLINK. These building blocks include highly optimized computational patterns for k-NN graph construction, spanning trees, and dendrogram cluster extraction. We show how we used our primitives to implement cuSLINK end-to-end on the GPU, further enabling a wide range of real-world data mining and machine learning applications that were once intractable. Single-linkage clustering is a primary computational bottleneck in the popular HDBSCAN algorithm, and the impact of our end-to-end cuSLINK algorithm extends to a large range of important applications, including cluster analysis in social and computer networks, natural language processing, and computer vision. Users can obtain cuSLINK at https://docs.rapids.ai/api/cuml/latest/api/#agglomerative-clustering

Comment: To appear in ECML PKDD 2023 by Springer Nature
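
The three building blocks named in the abstract above (k-NN graph construction, a spanning tree, and dendrogram cluster extraction) can be illustrated with a small CPU-only sketch. This is not the cuSLINK implementation or the cuML API: the function names `knn_edges` and `single_linkage_knn` are hypothetical, the k-NN search is brute force, and the sketch assumes k is large enough that the k-NN graph is connected down to the requested number of clusters and contains the minimum spanning tree edges. It does not reproduce how the paper handles a disconnected k-NN graph.

```python
import math


def knn_edges(points, k):
    """Brute-force k-NN graph: sorted undirected edges (distance, i, j) with i < j."""
    edges = set()
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def single_linkage_knn(points, k, n_clusters):
    """Single-linkage clustering restricted to the edges of a k-NN graph.

    Kruskal's algorithm visits edges in increasing length; each accepted edge
    is one dendrogram merge. Stopping once n_clusters components remain is
    equivalent to cutting the dendrogram at that level.
    """
    n = len(points)
    parent = list(range(n))
    components = n
    merges = []  # (i, j, distance) for each accepted spanning-tree edge
    for d, i, j in knn_edges(points, k):
        if components == n_clusters:
            break
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[rj] = ri
            merges.append((i, j, d))
            components -= 1
    if components > n_clusters:
        # Assumption of this sketch: we simply give up on a too-sparse graph.
        raise ValueError("k-NN graph is too sparse for n_clusters; increase k")
    # Relabel union-find roots as consecutive cluster ids in order of appearance.
    roots = {}
    labels = [roots.setdefault(find(parent, i), len(roots)) for i in range(n)]
    return labels, merges
```

For example, with the six points `[(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]`, `k=2`, and `n_clusters=2`, the function returns labels `[0, 0, 0, 1, 1, 1]` after four merges. Storing only k edges per point is what keeps memory at O(Nk) rather than the O(N^2) of a full distance matrix, at the cost of needing a larger k when clusters are far apart.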

arXiv.org e-Print Archive